00100 LANGUAGE-RECOGNITION PROCESSES FOR UNDERSTANDING DIALOGUES
00200 IN TELETYPED PSYCHIATRIC INTERVIEWS
00300
00400 Since the behavior being simulated by this paranoid model is
00500 the sequential language-behavior of a paranoid patient in a
00600 psychiatric interview, the model (PARRY) must have an ability to
00700 interpret and respond to natural language input to a degree
00800 sufficient to demonstrate conduct characteristic of the paranoid
00900 mode. By "natural language" I shall mean ordinary American
01000 English such as is used in everyday conversations. It is still
01100 difficult to be explicit about the processes which enable humans to
01200	interpret and respond to natural language. ("A mighty maze! but
01300 not without a plan." - A. Pope). Philosophers, linguists and
01400 psychologists have investigated natural language with various
01500 purposes. Few of the results have been useful to builders of
01600 interactive simulation models. Attempts have been made in artificial
01700	intelligence to write algorithms which "understand" teletyped
01800 natural language expressions. (Colby and Enea, 1967; Enea and
01900 Colby, 1973; Schank, Goldman, Rieger, and Riesbeck, 1973;
02000 Winograd, 1972; Woods, 1970). Computer understanding of natural
02100 language is actively being attempted today but it is not something to
02200	be completely achieved today or even tomorrow. For our model the
02300 problem was not to find immediately the best way of doing it but to
02400 find any way at all. We sought pragmatic feasibility, not instant
02500 optimality.
02600 During the 1960's when machine processing of natural language
02700 was dominated by syntactic considerations, it became clear that
02800 syntactical information alone was insufficient to comprehend the
02900 expressions of ordinary conversations. A current view is that to
03000 understand what information is contained in linguistic expressions,
03100 knowledge of syntax and semantics must be combined with beliefs from
03200 a conceptual structure capable of making inferences. How to achieve
03300 this combination efficiently with a large data-base represents a
03400 monumental task for both theory and implementation.
03500 Seeking practical performance, we did not attempt to
03600 construct a conventional linguistic parser to analyze conversational
03700 language of interviews. Parsers to date have had great difficulty in
03800 performing well enough to assign a meaningful interpretation to the
03900 expressions of everyday conversational language in unrestricted
04000 English. Purely syntactic parsers offer a cancerous proliferation
04100	of interpretations. A conventional parser, lacking mechanisms for
04200	neglecting or ignoring unrecognized input, may simply halt when it comes
04300	across a word not in its dictionary. Parsers represent tight conjunctions of tests
04400 instead of loose disjunctions needed for gleaning some degree of
04500 meaning from everyday language communication. It is easily observed
04600 that people misunderstand and "ununderstand" at times and thus remain
04700 partially opaque to one another, a truth which lies at the core of
04800 human life and communication.
04900 How language is understood depends on how people interpret
05000 the meanings of situations in which they find themselves. In a
05100 dialogue, language is understood in accordance with a participant's
05200 view of the situation. The participants are interested in both what
05300 an utterance means (what it refers to) and what the utterer means (
05400 his intentions). In a first psychiatric interview the doctor's
05500 intention is to gather certain kinds of information; the patient's
05600 intention is to give information in order to receive help. Such an
05700 interview is not small talk; a job is to be done. Our purpose was to
05800 develop a method for recognizing sequences of everyday English
05900 sufficient for the model to communicate linguistically in a paranoid
06000 way in the circumscribed situation of a psychiatric interview.
06100 We did not try to construct a general-purpose algorithm which
06200 could understand anything said in English by anybody to anybody else
06300 in any dialogue situation. (Does anyone believe that it is possible?)
06400 The seductive myth of generalization can lead to trivialization.
06500	Therefore, we sought simply to extract some degree of partial,
06600 idiosyncratic, idiolectic meaning (not the "complete" meaning,
06700 whatever that means) from the input. We utilized a
06800 pattern-directed, rather than a parsing-directed, approach because of
06900 the former's power to ignore irrelevant and unintelligible details.
07000 Natural language is not an agreed-upon universe of discourse
07100 such as arithmetic, wherein symbols have a fixed meaning for everyone
07200 who uses them. What we loosely call "natural language" is actually a
07300 set of history-dependent, selective, and interest-oriented idiolects,
07400 each being unique to the individual. (To be unique does not mean
07500 that no property is shared with other individuals, only that not
07600 every property is shared). It is the broad overlap of idiolects which
07700 allows the communication of shared meanings in everyday conversation.
07800 We took as pragmatic measures of "understanding" the ability
07900 (1) to form a conceptualization so that questions can be answered and
08000 commands carried out, (2) to determine the intention of the
08100 interviewer, (3) to determine the references for pronouns and other
08200 anticipated topics. This straightforward approach to a complex
08300 problem has its obvious drawbacks. We strove for a highly
08400 individualized idiolect sufficient to demonstrate paranoid processes
08500 of an individual in a particular situation rather than for a general
08600 supra-individual or ideal comprehension of English. If the
08700 language-recognition processes of PARRY were to interfere with
08800 demonstrating the paranoid processes, we would consider them
08900 defective and insufficient for our purposes.
09000 The language-recognition process utilized by PARRY first puts
09100 the teletyped input in the form of a list and then determines the
09200 syntactic type of the input expression - question, statement or
09300	imperative - by looking at introductory terms and at punctuation. The
09400 expression-type is then scanned for conceptualizations, i.e. patterns
09500 of contentives consisting of words or word-groups, stress-forms of
09600 speech having conceptual meaning relevant to the model's interests.
09700 The search for conceptualizations ignores (as irrelevant details)
09800 function or closed-class terms (articles, auxiliaries, conjunctions,
09900 prepositions, etc.) except as they might represent a component in a
10000 contentive word-group. For example, the word-group (for a living) is
10100 defined to mean "work" as in "What do you do for a living?" The
10200 conceptualization is classified according to the rules of Fig. 1 as
10300 malevolent, benevolent or neutral. Thus PARRY attempts to judge the
10400 intention of the utterer from the content of the utterance.
10500 (INSERT FIG.1 HERE)
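To make this first pass concrete, the following sketch (written in
present-day Python rather than in the notation of the actual program) shows
the general shape of the procedure: the input is put into list form, typed
by its introductory term and its punctuation, stripped of function words,
and scanned for contentive patterns which are then classified. The word
lists, the word-group, and the classification table are invented stand-ins
for the model's dictionaries and for the rules of Fig. 1.

# Illustrative sketch only; the word lists and patterns below are invented
# stand-ins for the model's actual dictionaries and Fig. 1 rules.
FUNCTION_WORDS = {"the", "a", "an", "do", "you", "your", "for", "of", "is", "are"}
WORD_GROUPS = {("for", "a", "living"): "work"}   # contentive word-group -> concept
CONCEPT_CLASS = {"work": "neutral", "mafia": "malevolent", "help": "benevolent"}

def listify(text):
    return text.lower().rstrip(".?!").split()

def expression_type(words, raw):
    if raw.strip().endswith("?") or (words and words[0] in ("what", "why", "how", "who", "when")):
        return "question"
    if words and words[0] in ("tell", "let's"):
        return "imperative"
    return "declarative"

def conceptualize(words):
    concepts, i = [], 0
    while i < len(words):
        for group, concept in WORD_GROUPS.items():   # try multi-word contentives first
            if tuple(words[i:i + len(group)]) == group:
                concepts.append(concept)
                i += len(group)
                break
        else:
            if words[i] not in FUNCTION_WORDS:       # closed-class terms are ignored
                concepts.append(words[i])
            i += 1
    return concepts

def classify_intention(concepts):
    classes = {CONCEPT_CLASS.get(c, "neutral") for c in concepts}
    if "malevolent" in classes:
        return "malevolent"
    if "benevolent" in classes:
        return "benevolent"
    return "neutral"

raw = "What do you do for a living?"
words = listify(raw)
concepts = conceptualize(words)
print(expression_type(words, raw), concepts, classify_intention(concepts))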
10600 Some special problems a dialogue algorithm must handle in a
10700 psychiatric interview will now be outlined along with a brief
10800 description of how the model deals with them.
10900
11000 QUESTIONS
11100
11200 The principal expression-type used by an interviewer is a
11300 question. A question is recognized by its first term being a "wh-" or
11400 "how" form and/or an expression ending with a question mark. In
11500 teletyped interviews a question may sometimes be put in declarative
11600 form followed by a question mark as in:
11700 (1) PT.- I LIKE TO GAMBLE ON THE HORSES.
11800 (2) DR.- YOU GAMBLE?
11900 Although a question-word or auxiliary verb is missing in (2), the
12000 model recognizes that a question is being asked about its gambling
12100 simply by the question mark.
12200 Particularly difficult are those "when" questions which
12300 require a memory which can assign each event a beginning, an end and
12400 a duration. Future versions of the model will have this capacity.
12500 Also troublesome are questions such as "how often", "how many", i.e.
12600 a "how" followed by a quantifier. If the model has "how often" on its
12700 expectancy list while a topic is under discussion, the appropriate
12800 reply can be made. Otherwise the model fails to understand.
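The expectancy mechanism can be pictured as follows; the topic, the cues,
and the replies are hypothetical examples, not the model's actual lists.

# Sketch: each topic carries an expectancy list of question forms the model
# is prepared to answer while that topic is under discussion.
EXPECTANCIES = {
    "gambling": {"how often": "EVERY WEEK OR SO.",
                 "how much": "MORE THAN I CAN AFFORD."},
}

def answer_question(current_topic, question):
    text = question.lower().rstrip("?").strip()
    for cue, reply in EXPECTANCIES.get(current_topic, {}).items():
        if cue in text:
            return reply
    return None        # no expectancy matched: the model fails to understand

print(answer_question("gambling", "HOW OFTEN DO YOU GAMBLE?"))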
12900 In constructing a simulation of symbolic processes it is
13000	arbitrary how much information to represent in the data-base. Should
13100 PARRY know which city is the capital of Alabama? It is trivial to
13200 store tomes of facts and there always will be boundary conditions. We
13300 took the position that the model should know only what we believed
13400 it reasonable to know relative to a few hundred topics expectable in
13500 a psychiatric interview. Thus PARRY performs poorly when subjected to
13600 baiting "exam" questions designed to test its informational
13700 limitations rather than to seek useful psychiatric information.
13800
13900 IMPERATIVES
14000
14100 Typical imperatives in a psychiatric interview consist of
14200 expressions like:
14300 (3) DR.- TELL ME ABOUT YOURSELF.
14400 (4) DR.- LET'S DISCUSS YOUR FAMILY.
14500 Such imperatives are actually interrogatives to the
14600 interviewee about the topics they refer to. Since the only physical
14700	action the model can perform is to "talk", imperatives are treated
14800 as requests for information. They are identified by the common
14900 introductory phrases: "tell me", "let's talk about", etc.
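A minimal sketch of this treatment, with the cue phrases drawn from the
examples above; returning the remainder of the expression as a topic to be
talked about is a simplification of our own.

# Sketch: imperatives are detected by stock introductory phrases and treated
# as requests for information about the remaining words.
IMPERATIVE_CUES = ("tell me about", "let's talk about", "let's discuss")

def imperative_topic(expr):
    text = expr.lower().rstrip(".!")
    for cue in IMPERATIVE_CUES:
        if text.startswith(cue):
            return text[len(cue):].strip()   # treat as "what about <topic>?"
    return None

print(imperative_topic("LET'S DISCUSS YOUR FAMILY."))   # -> "your family"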
15000
15100 DECLARATIVES
15200
15300 In this category is lumped everything else. It includes
15400 greetings, farewells, yes-no type answers, existence assertions and
15500 the usual predications.
15600
15700 AMBIGUITIES
15800
15900 Words have more than one sense, a convenience for human
16000 memories but a struggle for language-understanding algorithms.
16100 Consider the word "bug" in the following expressions:
16200 (5) AM I BUGGING YOU?
16300 (6) AFTER A PERIOD OF HEAVY DRINKING HAVE YOU FELT BUGS ON
16400 YOUR SKIN?
16500 (7) DO YOU THINK THEY PUT A BUG IN YOUR ROOM?
16600 In expression (5) the term "bug" means to annoy, in (6) it
16700 refers to an insect and in (7) it refers to a microphone used for
16800 hidden surveillance. PARRY uses context to carry out
16900 disambiguation. For example, when the Mafia is under discussion and
17000 the affect-variable of fear is high, the model interprets "bug" to
17100 mean microphone. In constructing this hypothetical individual we
17200 took advantage of the selective nature of idiolects which can have an
17300 arbitrary restriction on word senses. One characteristic of the
17400 paranoid mode is that regardless of what sense of a word the
17500 interviewer intends, the patient may idiosyncratically interpret it
17600 as some sense of his own. This property is obviously of great help
17700 for an interactive simulation with limited language-understanding
17800 abilities.
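The kind of context-dependence involved can be put in sketch form; the
topics, the fear threshold, and the senses listed are illustrative
assumptions only.

# Sketch: the sense chosen for "bug" depends on the current topic and on the
# level of the fear variable.
def sense_of_bug(current_topic, fear):
    if current_topic == "mafia" and fear > 0.7:
        return "hidden microphone"       # as in expression (7)
    if current_topic == "drinking":
        return "insect"                  # as in expression (6)
    return "annoy"                       # as in expression (5)

print(sense_of_bug("mafia", fear=0.9))   # -> hidden microphone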
17900
18000 ANAPHORIC REFERENCES
18100 The common anaphoric references consist of the pronouns "it",
18200 "he", "him", "she", "her", "they", "them" as in:
18300 (8) PT.-HORSERACING IS MY HOBBY.
18400 (9) DR.-WHAT DO YOU ENJOY ABOUT IT?
18500 When a topic is introduced by the patient as in (8), a number
18600 of things can be expected to be asked about it. Thus the algorithm
18700 has ready an updated expectancy-anaphora list which allows it to
18800 determine whether the topic introduced by the model is being
18900 responded to or whether the interviewer is continuing with the
19000 previous topic.
19100 The algorithm recognizes "it" in (9) as referring to
19200 "horseracing" because a flag was set when horseracing was introduced
19300 in (8), "it" was placed on the expected anaphora list, and no new
19400 topic has been introduced. A more difficult problem arises when the
19500 anaphoric reference points more than one I-O pair back in the
19600 dialogue as in:
19700 (10) PT.-THE MAFIA IS OUT TO GET ME.
19800 (11) DR.- ARE YOU AFRAID OF THEM?
19900 (12) PT.- MAYBE.
20000 (13) DR.- WHY IS THAT?
20100 The "that" of expression (13) does not refer to (12) but to
20200 the topic of being afraid which the interviewer introduced in (11).
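The bookkeeping behind this can be pictured roughly as follows; the data
structures and method names are a simplified invention, not those of the
model itself.

# Hypothetical sketch: when a topic comes up it is flagged, and the pronouns
# that could later refer to it are placed on an expectancy-anaphora list.
class Dialogue:
    def __init__(self):
        self.topic_stack = []          # most recently discussed topic last
        self.expected_anaphora = {}    # pronoun -> topic it would refer to

    def introduce(self, topic, pronouns):
        self.topic_stack.append(topic)
        for p in pronouns:
            self.expected_anaphora[p] = topic

    def resolve(self, pronoun):
        # The list persists across I-O pairs, so the "that" of (13) still
        # finds the topic introduced in (11); unlisted pronouns fall back to
        # the current topic.
        current = self.topic_stack[-1] if self.topic_stack else None
        return self.expected_anaphora.get(pronoun, current)

d = Dialogue()
d.introduce("horseracing", ["it"])        # after (8)
d.introduce("being afraid", ["that"])     # after (11)
print(d.resolve("it"), "/", d.resolve("that"))   # -> horseracing / being afraid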
20300 Another pronominal confusion occurs when the interviewer uses
20400 "we" in two senses as in:
20500 (14) DR.- WE WANT YOU TO STAY IN THE HOSPITAL.
20600 (15) PT.- I WANT TO BE DISCHARGED NOW.
20700 (16) DR.- WE ARE NOT COMMUNICATING.
20800 In expression (14) the interviewer is using "we" to refer to
20900 psychiatrists or the hospital staff while in (16) the term refers to
21000 the interviewer and patient. Identifying the correct referent would
21100 require beliefs about the dialogue itself.
21200
21300 TOPIC SHIFTS
21400
21500 In the main, a psychiatric interviewer is in control of the
21600 interview. When he has gained sufficient information about a topic,
21700 he shifts to a new topic. Naturally the algorithm must detect this
21800 change of topic as in the following:
21900 (17) DR.- HOW DO YOU LIKE THE HOSPITAL?
22000 (18) PT.- IT'S NOT HELPING ME TO BE HERE.
22100 (19) DR.- WHAT BROUGHT YOU TO THE HOSPITAL?
22200 (20) PT.- I AM VERY UPSET AND NERVOUS.
22300 (21) DR.- WHAT TENDS TO MAKE YOU NERVOUS?
22400 (23) PT.- JUST BEING AROUND PEOPLE.
22500 (24) DR.- ANYONE IN PARTICULAR?
22600 In (17) and (19) the topic is the hospital. In (21) the topic
22700 changes to causes of the patient's nervous state.
22800 Topics touched upon previously can be re-introduced at any
22900 point in the interview. PARRY knows that a topic has been discussed
23000 previously because a topic-flag is set when a topic comes up.
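The topic-flag can be pictured as a simple set of flags; the topics named
are hypothetical.

# Sketch: a per-topic flag records that a topic has already been discussed,
# so a re-introduced topic can be treated differently from a fresh one.
discussed = set()

def note_topic(topic):
    seen_before = topic in discussed
    discussed.add(topic)
    return seen_before

print(note_topic("hospital"))      # False - first mention
print(note_topic("nervousness"))   # False
print(note_topic("hospital"))      # True - the topic has been re-introduced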
23100
23200 META-REFERENCES
23300
23400 These are references, not about a topic directly, but about
23500 what has been said about the topic as in:
23600 (25) DR.- WHY ARE YOU IN THE HOSPITAL?
23700	     (26) PT.- I SHOULDN'T BE HERE.
23800 (27) DR.- WHY DO YOU SAY THAT?
23900 The expression (27) is about and meta to expression (26). The model
24000 does not respond with a reason why it said something but with a
24100 reason for the content of what it said, i.e. it interprets (27) as
24200 "why shouldn't you be here?"
24300 Sometimes when the patient makes a statement, the doctor
24400 replies, not with a question, but with another statement which
24500 constitutes a rejoinder as in:
24600	     (28) PT.- I HAVE LOST A LOT OF MONEY GAMBLING.
24700	     (29) DR.- I GAMBLE QUITE A BIT ALSO.
24800 Here the algorithm interprets (29) as a directive to
24900 continue discussing gambling, not as an indication to question the
25000 doctor about gambling.
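A rough sketch of both cases; the reply categories, the fixed cue, and the
test for a rejoinder are simplifications of our own.

# Sketch: a meta-question about the model's last remark is re-pointed at the
# content of that remark, while a statement on the current topic is read as
# a cue to keep talking about it.
last_output = "I SHOULDN'T BE HERE."

def respond(doctor_line, current_topic):
    line = doctor_line.upper().rstrip(".?")
    if line == "WHY DO YOU SAY THAT":
        return ("give_reason_for", last_output)   # i.e. "why shouldn't you be here?"
    if "?" not in doctor_line:                    # a rejoinder, not a question
        return ("continue_topic", current_topic)
    return ("answer_question", doctor_line)

print(respond("WHY DO YOU SAY THAT?", "hospital"))
print(respond("I GAMBLE QUITE A BIT ALSO.", "gambling"))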
25100
25200 ELLIPSES
25300
25400
25500 In dialogues one finds many ellipses, expressions from which
25600 one or more words are omitted as in:
25700	     (30) PT.- I SHOULDN'T BE HERE.
25800 (31) DR.- WHY NOT?
25900 Here the complete construction must be understood as:
26000 (32) DR.- WHY SHOULD YOU NOT BE HERE?
26100 Again, this is handled by the expectancy-anaphora list which
26200 anticipates a "why not".
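In sketch form (the expansion shown is written out by hand for this single
example):

# Sketch: after the model asserts something, the elliptical follow-ups it
# can expect are placed, together with their full forms, on the expectancy list.
expected_ellipses = {"why not": "why should you not be here",
                     "why": "why should you not be here"}

def expand(doctor_line):
    key = doctor_line.lower().rstrip("?").strip()
    return expected_ellipses.get(key, doctor_line)

print(expand("WHY NOT?"))   # -> "why should you not be here"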
26300	The opposite of ellipsis is redundancy, which usually presents
26400 no problem since the same thing is being said more than once as in:
26500	     (33) DR.- LET ME ASK YOU A QUESTION.
26600 The model simply recognizes (33) as a stereotyped pattern.
26700
26800 SIGNALS
26900
27000 Some fragmentary expressions serve only as directive signals
27100 to proceed, as in:
27200 (34) PT.- I WENT TO THE TRACK LAST WEEK.
27300 (35) DR.- AND?
27400 The fragment of (35) requests a continuation of the story introduced
27500 in (34). The common expressions found in interviews are "and", "so",
27600 "go on", "go ahead", "really", etc. If an input expression cannot be
27700 recognized at all, the lowest level default condition is to assume it
27800 is a signal and either proceed with the next line in a story under
27900 discussion or if a story has been exhausted, begin a new story with a
28000 prompting question or statement.
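This default behavior can be put in sketch form; the stored story lines and
the prompting question are invented examples.

# Sketch: a fragmentary or unrecognized input is taken as a signal to
# continue the current story or, when it is exhausted, to begin a new one.
story = ["I WENT TO THE TRACK LAST WEEK.", "I LOST A HUNDRED DOLLARS."]
position = 1               # the first line, (34), has already been said

def proceed():
    global position
    if position < len(story):
        line = story[position]
        position += 1
        return line
    return "DO YOU KNOW ANYTHING ABOUT BOOKIES?"   # prompt that opens a new story

def handle(unrecognized_input):
    return proceed()       # lowest-level default: assume the input is a signal

print(handle("AND?"))      # -> continues the story
print(handle("XYZZY"))     # -> story exhausted; a new one is prompted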
28100
28200 IDIOMS
28300
28400 Since so much of conversational language involves stereotypes
28500 and special cases, the task of recognition is much easier than that
28600 of linguistic analysis. This is particularly true of idioms.
28700 Whereas some idioms can be understood through analogy, most are a
28800	matter of rote-memory lookup. It is risky and time-consuming to
28900 decipher what an idiom means from an analysis of its constituent
29000 parts. If the reader doubts this, let him ponder the following
29100 expressions taken from actual teletyped interviews.
29200 (36) DR.- WHAT'S EATING YOU?
29300 (37) DR.- YOU SOUND KIND OF PISSED OFF.
29400 (38) DR.- WHAT ARE YOU DRIVING AT?
29500 (39) DR.- ARE YOU PUTTING ME ON?
29600 (40) DR.- WHY ARE THEY AFTER YOU?
29700 (41) DR.- HOW DO YOU GET ALONG WITH THE OTHER PATIENTS?
29800 (42) DR.- HOW DO YOU LIKE YOUR WORK?
29900 (43) DR.- HAVE THEY TRIED TO GET EVEN WITH YOU?
30000 (44) DR.- I CAN'T KEEP UP WITH YOU.
30100 In people, the use of idioms is a matter of rote memory or
30200 analogy. In an algorithm, idioms can simply be stored as such. As
30300 each new idiom appears in teletyped interviews, its
30400 recognition-pattern is added to the data-base on the inductive
30500 grounds that what happens once can happen again.
30600 Another advantage in constructing an idiolect for a model is
30700 that it recognizes its own idiomatic expressions which tend to be
30800 used by the interviewer (if he understands them) as in:
30900 (45) PT.- THEY ARE OUT TO GET ME.
31000 (46) DR.- WHAT MAKES YOU THINK THEY ARE OUT TO GET YOU.
31100	The expression (45) is really a double idiom in which "out"
31200 means "intend" and "get" means "harm" in this context. Needless to
31300	say, an algorithm which tried to pair off the various meanings of
31400 "out" with the various meanings of "get" would have a hard time of
31500 it. But an algorithm which recognizes what it itself is capable of
31600 saying, can easily recognize echoed idioms.
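The rote-lookup treatment of idioms, including the model's own echoed
idioms, can be pictured as a simple table search; the glosses given here
are illustrative.

# Sketch: idioms are matched by rote lookup rather than analysis, and the
# model's own stock phrases double as recognition patterns when the
# interviewer echoes them.
IDIOMS = {
    "what's eating you": "what is bothering you",
    "are you putting me on": "are you deceiving me",
    "out to get": "intend to harm",     # the double idiom of expression (45)
}

def decode_idiom(expr):
    text = expr.lower().rstrip(".?!")
    for pattern, gloss in IDIOMS.items():
        if pattern in text:
            return gloss
    return None

print(decode_idiom("WHAT MAKES YOU THINK THEY ARE OUT TO GET YOU."))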
31700
31800 FUZZ TERMS
31900
32000 In this category fall a large number of expressions which, as
32100 non-contentives, have little or no meaning and therefore can be
32200 ignored by the algorithm. The lower-case expressions in the following
32300 are examples of fuzz:
32400 (47) DR.- well now perhaps YOU CAN TELL ME something ABOUT
32500 YOUR FAMILY.
32600 (48) DR.- on the other hand I AM INTERESTED IN YOU.
32700 (49) DR.- hey I ASKED YOU A QUESTION.
32800 The algorithm has "ignoring mechanisms" which allow for an
32900 `anything' slot in its pattern recognition. Fuzz terms are thus
33000 easily ignored and no attempt is made to analyze them.
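The "anything" slot amounts to requiring the pattern's words in order and
skipping whatever lies between them, roughly as follows:

# Sketch: the pattern cares only that its required words appear in order;
# whatever lies between them is fuzz and is skipped without analysis.
def matches(pattern, expr):
    words = expr.lower().rstrip(".?!").split()
    i = 0
    for wanted in pattern:
        while i < len(words) and words[i] != wanted:
            i += 1                  # skip fuzz ("well now perhaps", "hey", ...)
        if i == len(words):
            return False
        i += 1
    return True

print(matches(["tell", "family"],
              "well now perhaps YOU CAN TELL ME something ABOUT YOUR FAMILY."))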
33100
33200 SUBORDINATE CLAUSES
33300
33400 A subordinate clause is a complete statement inside another
33500 statement. It is most frequently introduced by a relative pronoun,
33600 indicated in the following expressions by lower case:
33700 (50) DR.- WAS IT THE UNDERWORLD that PUT YOU HERE?
33800 (51) DR.- WHO ARE THE PEOPLE who UPSET YOU?
33900 (52) DR.- HAS ANYTHING HAPPENED which YOU DONT UNDERSTAND?
34000 One of the linguistic weaknesses of the model is that it
34100 takes the entire input as a single expression. When the input is
34200 syntactically complex, containing subordinate clauses, the algorithm
34300 can become confused. To avoid this, future versions of PARRY will
34400 segment the input into shorter and more manageable patterns in which
34500 an optimal selection of emphases and neglect of irrelevant detail can
34600 be achieved while avoiding combinatorial explosions.
34700 VOCABULARY
34800
34900 How many words should there be in the algorithm's vocabulary?
35000 It is a rare human speaker of English who can recognize 40% of the
35100 415,000 words in the Oxford English Dictionary. In his everyday
35200 conversation an educated person uses perhaps 10,000 words and has a
35300 recognition vocabulary of about 50,000 words. A study of telephone
35400	conversations showed that 96% of the talk employed only 737 words.
35500	(French, Carter, and Koenig, 1930). Of course, if the remaining 4% are
35600	important but unrecognized contentives, the result may be ruinous to
35700 the coherence of a conversation.
35800 In counting all the words in 53 teletyped psychiatric
35900 interviews conducted by psychiatrists, we found only 721 different
36000 words. Since we are familiar with psychiatric vocabularies and
36100 styles of expression, we believed this language-algorithm could
36200 function adequately with a vocabulary of at most a few thousand
36300 contentives. There will always be unrecognized words. The algorithm
36400 must be able to continue even if it does not have a particular word
36500 in its vocabulary. This provision represents one great advantage
36600 of pattern-matching over conventional linguistic parsing. Our
36700 algorithm can guess while a traditional parser must know with
36800 certainty in order to proceed.
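The guessing amounts to discarding words not in the vocabulary before
matching, rather than halting on them; the vocabulary shown is a toy
example.

# Sketch: unknown words are simply dropped, so they degrade the match
# instead of stopping the program.
VOCABULARY = {"mafia", "afraid", "hospital", "gamble", "horses"}

def known_contentives(expr):
    return [w for w in expr.lower().rstrip(".?!").split() if w in VOCABULARY]

print(known_contentives("ARE YOU AFRAID OF THE COSA NOSTRA?"))   # -> ['afraid']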
36900
37000 MISSPELLINGS
37100
37200 Misspellings are common in teletyped interviews because (1)
37300 most people are not perfect spellers and (2) phone lines send the
37400 wrong characters to teletypes. One can defend against these errors by
37500 having a person monitor the conversation and type the correct
37600 spellings to PARRY.
37700	Future versions of the model will contain a dictionary of
37800 common misspellings and utilize heuristic techniques (dropping and
37900 permuting characters) to achieve correct spelling forms.
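The heuristics mentioned can be sketched as follows; the word lists are
illustrative, and the procedure is our own reconstruction rather than part
of any existing version of the model.

# Sketch: look the word up in a table of common misspellings, then try
# dropping single characters and transposing adjacent ones.
COMMON_MISSPELLINGS = {"beleive": "believe"}
VOCABULARY = {"believe", "afraid", "hospital"}

def correct(word):
    if word in VOCABULARY:
        return word
    if word in COMMON_MISSPELLINGS:
        return COMMON_MISSPELLINGS[word]
    for i in range(len(word)):                      # dropping a character
        candidate = word[:i] + word[i + 1:]
        if candidate in VOCABULARY:
            return candidate
    for i in range(len(word) - 1):                  # permuting adjacent characters
        candidate = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if candidate in VOCABULARY:
            return candidate
    return word                                     # give up; treat as unknown

print(correct("hospitall"), correct("afriad"), correct("beleive"))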
38000
38100 META VERBS
38200
38300 Certain common verbs such as "think", "feel", "believe", etc.
38400	can take a clause as their objects as in:
38500 (54) DR.- I THINK YOU ARE RIGHT.
38600 (55) DR.- WHY DO YOU FEEL THE GAMBLING IS CROOKED?
38700 The verb "believe" is peculiar since it can also take as
38800 object a noun or noun phrase as in:
38900 (56) DR.- I BELIEVE YOU.
39000 In expression (55) the conjunction "that" can follow the word
39100 "feel" signifying a subordinate clause. This is not the case after
39200 "believe" in expression (56). PARRY makes the correct
39300 identification in (56) because nothing follows the "you".
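In sketch form; the test shown (is anything left after the "you"?) covers
only the special case described above.

# Sketch: after "I believe you", the presence or absence of further words
# decides between a noun object and a clause object.
def believe_object(expr):
    words = expr.lower().rstrip(".?!").split()
    if words[:3] == ["i", "believe", "you"]:
        return "noun object" if len(words) == 3 else "clause object"
    return "other"

print(believe_object("I BELIEVE YOU."))                     # -> noun object
print(believe_object("I BELIEVE YOU ARE AFRAID OF THEM."))  # -> clause object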
39400 ODD WORDS
39500 From extensive experience with teletyped interviews, we
39600 learned the model must have patterns for "odd" words. We term them
39700 such since these are words which are quite natural in the usual
39800 vis-a-vis interview in which the participants communicate through
39900 speech, but which are quite odd in the context of a teletyped
40000 interview. This should be clear from the following examples in which
40100 the odd words appear in lower case:
40200 (57) DR.-YOU sound CONFUSED.
40300 (58) DR.- DID YOU hear MY LAST QUESTION?
40400 (59) DR.- WOULD YOU come in AND sit down PLEASE?
40500 (60) DR.- CAN YOU say WHO?
40600 (61) DR.- I WILL see YOU AGAIN TOMORROW.
40700
40800
40900 MISUNDERSTANDING
41000
41100 It is perhaps not fully recognized by students of language
41200 how often people misunderstand one another in conversation and yet
41300	their dialogues proceed as if understanding and being understood were
41400 taking place.
41500 A funny example is the following man-on-the-street interview.
41600 INTERVIEWER - WHAT DO YOU THINK OF MARIHUANA?
41700 MAN - DIRTIEST TOWN IN MEXICO.
41800 INTERVIEWER - HOW ABOUT LSD?
41900 MAN - I VOTED FOR HIM.
42000 INTERVIEWER - HOW DO YOU FEEL ABOUT THE INDIANAPOLIS 500?
42100 MAN - I THINK THEY SHOULD SHOOT EVERY LAST ONE OF THEM.
42200 INTERVIEWER - AND THE VIET CONG POSITION?
42300 MAN - I'M FOR IT, BUT MY WIFE COMPLAINS ABOUT HER ELBOWS.
42400 Sometimes a psychiatric interviewer realizes when
42500 misunderstanding occurs and tries to correct it. Other times he
42600 simply passes it by. It is characteristic of the paranoid mode to
42700 respond idiosyncratically to particular word-concepts regardless of
42800 what the interviewer is saying:
42900 (62) PT.- SOME PEOPLE HERE MAKE ME NERVOUS.
43000 (63) DR.- I BET.
43100 (64) PT.- GAMBLING HAS BEEN NOTHING BUT TROUBLE FOR ME.
43200 Here one word sense of "bet" (to wager) is confused with the offered
43300 sense of expressing agreement. As has been mentioned, this
43400 sense-confusion property of paranoid conversation eases the task of
43500 simulation.
43600 UNUNDERSTANDING
43700
43800 A dialogue algorithm must be prepared for situations in which
43900 it simply does not understand. It cannot arrive at any interpretation
44000 as to what the interviewer is saying since no pattern can be matched.
44100 It may recognize the topic but not what is being said about it.
44200	The language-recognizer should not be faulted for simply
44300	lacking information irrelevant to its purpose, as in:
44400 (65) DR.- WHAT IS THE FIFTIETH STATE?
44500 when the data-base does not contain the answer. In this default
44600 condition it is simplest to reply:
44700 (66) PT.- I DONT KNOW.
44800 When information is absent it is dangerous to reply:
44900 (67) PT.- COULD YOU REPHRASE THE QUESTION?
45000 because of the disastrous loops which can result.
45100 Since the main problem in the default condition of
45200 ununderstanding is how to continue, PARRY employs heuristics such as
45300 changing the level of the dialogue and asking about the interviewer's
45400 intention as in:
45500 (68) PT.- WHY DO YOU WANT TO KNOW THAT?
45600 or rigidly continuing with a previous topic or introducing a new
45700 topic.
45800 These are admittedly desperate measures intended to prompt
45900 the interviewer in directions the algorithm has a better chance of
46000 understanding. Although it is usually the interviewer who controls
46100 the flow from topic to topic, there are times when control must be
46200 assumed by the model.
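These default measures can be gathered into a single sketch; the particular
replies and their ordering are illustrative only.

# Sketch of the defaults used when no pattern can be matched.
import random

def default_reply(topic_recognized, current_topic):
    if topic_recognized:
        # the topic is known but not what is being said about it
        return "WHY DO YOU WANT TO KNOW ABOUT " + current_topic.upper() + "?"
    return random.choice([
        "I DONT KNOW.",                       # simple lack of information, as in (66)
        "WHY DO YOU WANT TO KNOW THAT?",      # shift the level to the doctor's intention
        "LET ME TELL YOU ABOUT THE HORSES.",  # rigidly introduce a familiar topic
    ])

print(default_reply(False, None))
print(default_reply(True, "gambling"))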
46300 There are many additional problems in understanding
46400	conversational language, but the description in this chapter should be
46500 sufficient to convey some of the complexities involved. Further
46600 examples will be presented in the next chapter in describing the
46700 logic of the central processes of the model.